Noun Phrase Translations for Cross-Language Document Selection
Abstract
This paper presents results for the CLEF interactive Cross-Language Document Selection task at the UNED. Two translation techniques were compared: the standard Systran translations provided by the CLEF organizers as a baseline, and a phrase-based pseudo-translation approach that uses a phrase alignment algorithm based on comparable corpora. The hypothesis being tested was that noun phrase translations could serve as summarized information for relevance judgment without compromising the precision of such judgments. In addition, we wanted an indirect measure of the quality of our phrase extraction process, which had previously been developed for an interactive CLIR application. The results of the experiment confirm that the hypothesis is reasonable: a set of 8 monolingual Spanish speakers judged English documents with the same precision for both systems, but achieved 52% more recall using phrasal translations than using full Systran translations.
Similar resources
Noun phrases as building blocks for cross-language Search Assistance
This paper presents a Foreign-Language Search Assistant that uses noun phrases as fundamental units for document translation and query formulation, translation and refinement. The system (a) supports the foreign-language document selection task providing a cross-language indicative summary based on noun phrase translations, and (b) supports query formulation and refinement using the information...
Experiments with a Noun-Phrase driven Statistical Machine Translation System
This paper presents a noun phrase driven two-level statistical machine translation system. Noun phrases (NPs) are used as the unit of decomposition to build a two-level hierarchy of phrases. English noun phrases are identified using a parser. The corresponding translations are induced using a statistical word alignment model. Identified noun phrase pairs in the training corpus are replaced with...
Using Noun Phrase Heads to Extract Document Keyphrases
Automatically extracting keyphrases from documents is a task with many applications in information retrieval and natural language processing. Document retrieval can be biased towards documents containing relevant keyphrases; documents can be classified or categorized based on their keyphrases; automatic text summarization may extract sentences with high keyphrase scores. This paper describes a ...
Collocational Clashes in the Persian Translations of Tuesdays with Morrie
This study aimed at finding features of collocational deviations in the translations of Tuesdays with Morrie. In this direction, categories of collocations and collocational clashes, as well as causes of collocational clashes were explored. The present work investigated five Persian translations of the novel. All the books were examined completely and all possible collocational clashes were...
A Note on Mandarin Possessives, Demonstratives, and Definiteness
Yang (2004) observes that in Mandarin, an initial possessor phrase (PossessorP) may be followed by a bare noun as in (1), or by a possessee phrase that can be headed by a numeral and classifier, [Numeral + CL + N], as in (2) or by a demonstrative, [Dem + (Numeral) + CL + N] as in (3). (In all the examples in this section, we begin with Yang’s own initial glosses and translations. The interpreta...
Publication date: 2001